Understanding Data Races in MySQL
نویسندگان
چکیده
Data races are notorious for their close relationship to many painful concurrency bugs that are very difficult to locate. On the other hand, it is both impossible and unnecessary to forbid every data race in large software systems, by locking every shared variable. It is impossible since we cannot afford for the performance deterioration by locking everything. It is unnecessary since most of the data races are actually benign, i.e., they do not compromise program’s correctness. In this paper, we study the behavior of data races inside MySQL, a modern database management system. We collect a large set of data races reported by Helgrind, after running a benchmark on MySQL that contains SQL query workloads coming from typical database applications. We then carefully examine these races and work out a taxonomy on programming paradigms that can cause data races. To learn the semantics implied in these paradigms, we further study the roles of the shared variables that are involved in the races, based on their read/write histories. Our findings in this paper will be helpful for programmers to get a deeper understanding on how to write more reliable multithreading programs.
منابع مشابه
Bypassing Races in Live Applications with Execution Filters
Deployed multithreaded applications contain many races because these applications are difficult to write, test, and debug. Worse, the number of races in deployed applications may drastically increase due to the rise of multicore hardware and the immaturity of current race detectors. LOOM is a “live-workaround” system designed to quickly and safely bypass application races at runtime. LOOM provi...
متن کاملOutput-Deterministic Replay for Multicore Debugging
Reproducing bugs is hard. Deterministic replay systems aim to address this problem, by providing a high-fidelity replica of an original program execution that can be repeatedly executed to zero-in on bugs. Unfortunately, existing replay systems for multiprocessor programs fall short. These systems either incur high overheads, rely on nonstandard multiprocessor hardware, or fail to reliably repr...
متن کاملData Mining with R: Learning with Case Studies
Covers the main data mining techniques through carefully selected case studies Describes code and approaches that can be easily reproduced or adapted to your own problems Requires no prior experience with R Includes introductions to R and MySQL basics Provides a fundamental understanding of the merits, drawbacks, and analysis objectives of the data mining techniques Offers data and ...
متن کاملUse of Graph Database for the Integration of Heterogeneous Biological Data
Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to na...
متن کاملRecovery Principles in MySQL Cluster 5.1
MySQL Cluster is a parallel main memory database. It is using the normal MySQL software with a new storage engine NDB Cluster. MySQL Cluster 5.1 has been adapted to also handle fields on disk. In this work a number of recovery principles of MySQL Cluster had to be adapted to handle very large data sizes. The article presents an efficient algorithm for synchronizing a starting node with very lar...
متن کامل